 multi-world approach


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

NIPS Neural Information Processing Systems, 8-11th December 2014, Montreal, Canada
Paper ID: 879
Title: A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

Current Reviews

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance.

The authors present a method for question answering about real-world scenes: given a real-world image and a question about objects in that image, their system answers the question. For the question-answering engine, the authors have generated a novel dataset with more than 12k question-answer pairs. The authors show improved performance when using the multi-world approach, but it did not fully convince me of its quality, since the accuracy (and WUPS) is pretty low either way. I would like to see more evidence and understanding of the importance and contribution of the multi-world approach.



A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

Mateusz Malinowski, Mario Fritz

Neural Information Processing Systems

We propose a method for automatically answering questions about images by bringing together recent advances from natural language processing and computer vision. We combine discrete reasoning with uncertain predictions by a multi-world approach that represents uncertainty about the perceived world in a Bayesian framework. Our approach can handle human questions of high complexity about realistic scenes and replies with a range of answers such as counts, object classes, instances and lists of them. The system is directly trained from question-answer pairs. We establish a first benchmark for this task that can be seen as a modern attempt at a visual Turing test.
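
The idea of combining discrete reasoning with uncertain predictions can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that each "world" is one possible interpretation of the scene (here a dictionary of object counts) with a probability attached, and that a question has already been turned into a deterministic function of a world. The names `answer_distribution`, `count_chairs` and the example worlds are hypothetical.

```python
from collections import defaultdict


def answer_distribution(worlds, answer_fn):
    """Marginalise a deterministic answer function over weighted worlds.

    worlds:    list of (probability, world) pairs (hypothetical representation)
    answer_fn: function mapping a world to a hashable answer
    """
    totals = defaultdict(float)
    for prob, world in worlds:
        # Discrete reasoning happens inside one world; the uncertainty
        # enters only through the weight attached to that world.
        totals[answer_fn(world)] += prob
    norm = sum(totals.values())
    return {answer: p / norm for answer, p in totals.items()}


# Three hypothetical interpretations of the same image (made-up numbers).
worlds = [
    (0.5, {"chair": 2, "table": 1}),
    (0.3, {"chair": 3, "table": 1}),
    (0.2, {"chair": 2, "table": 2}),
]


def count_chairs(world):
    """Stand-in for the question 'How many chairs are there?'."""
    return world.get("chair", 0)


posterior = answer_distribution(worlds, count_chairs)
best = max(posterior, key=posterior.get)
print(posterior, best)
```

Answers that are supported by several plausible worlds accumulate probability mass, so the final reply does not depend on a single, possibly wrong, scene interpretation.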

